

**We claim:**

1        1. A system for enciphering information, comprising:  
2            a first memory access unit configured to retrieve data from a host memory;  
3            a staggered FIFO unit configured to receive the retrieved data from the host memory, the  
4            staggered FIFO unit further configured to perform a ShiftRow step of an Advanced Encryption  
5            Standard (AES) algorithm on the data to produce row-shifted data;  
6            a second memory access unit configured to receive the produced row-shifted data and  
7            perform a byte substitution using the row-shifted data to produce byte-substituted data;  
8            logic configured to receive the byte-substituted data and expand the byte-substituted data  
9            to produce manipulated data using a designated expansion algorithm; and  
10            a subprocessor memory configured to receive and store the manipulated data.

1        2. A system for enciphering information, comprising:  
2            a host processor comprising a host memory, the host memory having data;  
3            a subprocessor having a subprocessor memory, the subprocessor configured to retrieve  
4            the data from the host memory and manipulate the data as the data is being loaded into the  
5            subprocessor memory.

1        3. The system of claim 2, wherein the subprocessor is configured to execute an  
2            Advanced Encryption Standard (AES) algorithm using the data.

1       4. The system of claim 3, wherein the subprocessor comprises a first memory access  
2 unit configured to retrieve the data from a host memory.

1       5. The system of claim 4, wherein the subprocessor further comprises a staggered  
2 FIFO unit configured to receive the retrieved data from the host memory, the staggered FIFO  
3 unit further configured to perform a ShiftRow step of the AES algorithm on the data to produce  
4 row-shifted data.

1       6. The system of claim 5, wherein the subprocessor further comprises a second  
2 memory access unit configured to receive the produced row-shifted data and perform a byte  
3 substitution operation on the row-shifted data using a substitution table to produce byte-  
4 substituted data.

1       7. The system of claim 6, wherein the subprocessor further comprises a data  
2 expansion unit configured to receive the byte-substituted data and expand the byte-substituted  
3 data to produce manipulated data using a designated expansion algorithm.

1       8. The system of claim 7, wherein the subprocessor further comprises a subprocessor  
2 memory configured to receive and store the manipulated data.

1       9. The system of claim 8, wherein the subprocessor further comprises a subprocessor  
2 memory access unit configured to retrieve the stored data.

1        10.     The system of claim 5, wherein the staggered FIFO unit comprises:  
2              a first set of variable-layer FIFOs having a plurality of FIFO layers, the first set of  
3              variable-layer FIFOs configured to receive a first portion of the data in response to a designated  
4              clock signal, the first set of variable-layer FIFOs further configured to cascade the received first  
5              portion of the data through the plurality of FIFO layers in response to consecutive clock signals  
6              after the designated clock signal, the variable-layer FIFOs further configured to release the  
7              cascaded first portion of the data; and

8              a second set of variable-layer FIFOs having a plurality of FIFO layers, the second set of  
9              variable-layer FIFOs configured to receive a second portion of the data in response to the  
10             designated clock signal, the second set of variable-layer FIFOs further configured to cascade the  
11             received second portion of the data through the plurality of FIFO layers in response to the  
12             consecutive clock signals after the designated clock signal, the second set of variable-layer  
13             FIFOs further configured to buffer the cascaded second portion of the data for one clock cycle,  
14             the variable-layer FIFOs further configured to release the buffered second portion of the data.

1        11.     The system of claim 10, wherein the data comprises a matrix having four rows of  
2        data, each of the four rows of data having four bytes of data.

1        12.     The system of claim 11, wherein the first set of variable-layer FIFOs comprises a  
2        first release-FIFO layer configured to receive a first set of three bytes of the data matrix in  
3        response to a designated clock signal, the first release-FIFO layer further configured to release  
4        the first set of three bytes of the data matrix in response to a first clock signal, the first clock  
5        signal immediately following the designated clock signal, the first release-FIFO layer further

6 configured to receive a second set of three bytes of the data matrix in response to the first clock  
7 signal, the first release-FIFO layer further configured to release the second set of three bytes of  
8 the data matrix in response to a second clock signal, the second clock signal immediately  
9 following the first clock signal, the first release-FIFO layer further configured to receive a third  
10 set of three bytes in response to the second clock signal, the first release-FIFO layer further  
11 configured to release the third set of three bytes in response to a third clock signal, the third  
12 clock signal immediately following the second clock signal, the first release-FIFO layer further  
13 configured to receive a fourth set of three bytes in response to the third clock signal, the first  
14 release-FIFO layer further configured to release the fourth set of three bytes in response to a  
15 fourth clock signal, the fourth clock signal immediately following the third clock signal.

1       13.     The system of claim 12, wherein the first set of variable-layer FIFOs further  
2 comprises a second release-FIFO layer configured to receive two bytes of the first set of three  
3 bytes in response to the first clock signal, the second release-FIFO layer further configured to  
4 release the two bytes of the first set of three bytes in response to the second clock signal, the  
5 second release-FIFO layer further configured to receive two bytes of the second set of three  
6 bytes in response to the second clock signal, the second release-FIFO layer further configured to  
7 release the two bytes of the second set of three bytes in response to the third clock signal, the  
8 second release-FIFO layer further configured to receive two bytes of the third set of three bytes  
9 in response to the third clock signal, the second release-FIFO layer further configured to release  
10 the two bytes of the third set of three bytes in response to the fourth clock signal, the second  
11 release-FIFO layer further configured to receive two bytes of the fourth set of three bytes in  
12 response to the fourth clock signal, the second release-FIFO layer further configured to release

13 the two bytes of the fourth set of three bytes in response to a fifth clock signal, the fifth clock  
14 signal immediately following the fourth clock signal.

1 14. The system of claim 13, wherein the first set of variable-layer FIFOs further  
2 comprises a third release-FIFO layer configured to receive one byte of the two bytes of the first  
3 set of three bytes in response to the second clock signal, the third release-FIFO layer further  
4 configured to release the one byte of the two bytes of the first set of three bytes in response to the  
5 third clock signal, the third release-FIFO layer further configured to receive one byte of the two  
6 bytes of the second set of three bytes in response to the third clock signal, the third release-FIFO  
7 layer further configured to release the one byte of the two bytes of the second set of three bytes  
8 in response to the fourth clock signal, the third release-FIFO layer further configured to receive  
9 one byte of the two bytes of the third set of three bytes in response to the fourth clock signal, the  
10 third release-FIFO layer further configured to release the one byte of the two bytes of the third  
11 set of three bytes in response to the fifth clock signal, the third release-FIFO layer further  
12 configured to receive one byte of the two bytes of the fourth set of three bytes in response to the  
13 fifth clock signal, the third release-FIFO layer further configured to release the one byte of the  
14 two bytes of the fourth set of three bytes in response to a sixth clock signal, the sixth clock signal  
15 immediately following the fifth clock signal.

1 15. The system of claim 14, wherein the second set of variable-layer FIFOs comprises  
2 a first delay-FIFO layer configured to receive a second set of one byte of the data matrix in  
3 response to the first clock signal, the first delay-FIFO layer further configured to release the  
4 second set of one byte of the data matrix in response to the second clock signal, the first delay-

5 FIFO layer further configured to receive a third set of one byte of the data matrix in response to  
6 the second clock signal, the first delay-FIFO layer further configured to release the third set of  
7 one byte of the data matrix in response to the third clock signal, the first delay-FIFO layer further  
8 configured to receive a fourth set of one byte in response to the third clock signal, the first delay-  
9 FIFO layer further configured to hold the fourth set of one byte in response to a fourth clock  
10 signal, the first delay-FIFO layer further configured to release the fourth set of one byte in  
11 response to the fifth clock signal.

1 16. The system of claim 15, wherein the second set of variable-layer FIFOs further  
2 comprises a second delay-FIFO layer configured to receive, in response to the second clock  
3 signal, the second set of one byte of the data matrix and one byte of the first set of three bytes  
4 from the first release-FIFO layer, the second delay-FIFO layer further configured to release, in  
5 response to the third clock signal, the second set of one byte of the data matrix and the one byte  
6 of the first set of three bytes from the first release-FIFO layer, the second delay-FIFO layer  
7 further configured to receive, in response to the third clock signal, the third set of one byte of the  
8 data matrix and one byte of the second set of three bytes from the first release-FIFO layer, the  
9 second delay-FIFO layer further configured to hold, in response to the fourth clock signal, the  
10 third set of one byte of the data matrix and the one byte of the second set of three bytes from the  
11 first release-FIFO layer, the second delay-FIFO layer further configured to release, in response to  
12 the fifth clock signal, the third set of one byte of the data matrix and the one byte of the second  
13 set of three bytes from the first release-FIFO layer, the second delay-FIFO layer further  
14 configured to receive, in response to the fifth clock signal, the fourth set of one byte of the data  
15 matrix and one byte of the third set of three bytes from the first release-FIFO layer, the second

16 delay-FIFO layer further configured to release, in response to the sixth clock signal, the fourth  
17 set of one byte of the data matrix and the one byte of the third set of three bytes from the first  
18 release-FIFO layer.

1           17. The system of claim 16, wherein the second set of variable-layer FIFOs further  
2 comprises a third delay-FIFO layer configured to receive, in response to the third clock signal,  
3 the second set of one byte of the data matrix and two bytes of the first set of three bytes from the  
4 first release-FIFO layer, the third delay-FIFO layer further configured to hold, in response to the  
5 fourth clock signal, the second set of one byte of the data matrix and the two bytes of the first set  
6 of three bytes from the first release-FIFO layer, the third delay-FIFO layer further configured to  
7 release, in response to the fifth clock signal, the second set of one byte of the data matrix and the  
8 two bytes of the first set of three bytes from the first release-FIFO layer, the third delay-FIFO  
9 layer further configured to receive, in response to the fifth clock signal, the third set of one byte  
10 of the data matrix and two bytes of the second set of three bytes from the first release-FIFO  
11 layer, the third delay-FIFO layer further configured to release, in response to the sixth clock  
12 signal, the third set of one byte of the data matrix and the two bytes of the second set of three  
13 bytes from the first release-FIFO layer, the third delay-FIFO layer further configured to output,  
14 in response to the sixth clock signal, the fourth set of one byte of the data matrix and the two  
15 bytes of the third set of three bytes from the first release-FIFO layer.

1           18. The system of claim 6, wherein the substitution table is located in the host  
2 memory.

1        19. The system of claim 6, wherein the substitution table is configured as a hardware  
2 logic component within the subprocessor.

1        20. The system of claim 6, wherein the substitution table is located in the  
2 subprocessor memory.

1        21. A system for enciphering information, comprising:  
2            means for retrieving data from a host memory;  
3            means for performing a ShiftRow operation of an Advanced Encryption Standard (AES)  
4 algorithm on the data to produce row-shifted data;  
5            means for performing a byte substitution using the row-shifted data to produce byte-  
6 substituted data;  
7            means for expanding the byte-substituted data to produce manipulated data using a  
8 designated expansion algorithm; and  
9            means for storing the manipulated data.

1        22. A system for enciphering information, comprising:  
2            means for retrieving data from the host memory; and  
3            means for manipulating the data as the data is being loaded into a subprocessor memory.

1        23. The system of claim 22, wherein the means for manipulating the data further  
2 comprises means for executing an Advanced Encryption Standard (AES) algorithm using the  
3 data.

1        24.     The system of claim 23, wherein the means for executing the AES algorithm  
2     further comprises means for retrieving data from a host memory.

1        25.     The system of claim 24, further comprising means for performing a ShiftRow step  
2     of the AES algorithm on the data to produce row-shifted data.

1        26.     The system of claim 25, further comprising means for performing a byte  
2     substitution operation on the row-shifted data using a substitution table to produce byte-  
3     substituted data.

1        27.     The system of claim 26, further comprising means for expanding the byte-  
2     substituted data to produce manipulated data using a designated expansion algorithm.

1        28.     The system of claim 27, further comprising means for storing the manipulated  
2     data.

1        29.     The system of claim 28, further comprising means for retrieving the stored data.

1        30.     The system of claim 25, wherein the means for performing the ShiftRow step  
2     further comprises:

3              means for receiving a portion of the data in response to a designated clock signal;  
4              means for cascading the received portion of the data through a first set of FIFOs in  
5     response to consecutive clock signals after the designated clock signal;

6 means for releasing the cascaded portion of the data;  
7 means for receiving a remaining portion of the data in response to the designated clock  
8 signal;  
9 means for cascading the received remaining portion of the data through a second set of  
10 FIFOs in response to the consecutive clock signals after the designated clock signal;  
11 means for buffering the cascaded remaining portion of the data for one clock cycle; and  
12 means for releasing the buffered remaining portion of the data.

CROSS-REFERENCED PATENTS

1 31. A method for enciphering information, comprising the steps of:  
2 retrieving data from a host memory;  
3 performing a ShiftRow operation of an Advanced Encryption Standard (AES) algorithm  
4 on the data to produce row-shifted data;  
5 performing a byte substitution using the row-shifted data to produce byte-substituted  
6 data;  
7 expanding the byte-substituted data to produce manipulated data using a designated  
8 expansion algorithm; and  
9 storing the manipulated data.

1 32. A method for enciphering information, comprising the steps of:  
2 retrieving data from the host memory; and  
3 manipulating the data as the data is being loaded into a subprocessor memory.

1       33.     The method of claim 32, wherein the step of manipulating the data comprises the  
2     step of executing the Advanced Encryption Standard (AES) algorithm using the data.

1       34.     The method of claim 33, wherein the step of executing the AES algorithm  
2     comprises the step of performing a ShiftRow step of the AES algorithm on the data to produce  
3     row-shifted data.

1       35.     The method of claim 34, further comprising the step of performing a byte  
2     substitution operation on the row-shifted data using a substitution table to produce byte-  
3     substituted data.

1       36.     The method of claim 35, further comprising the step of expanding the byte-  
2     substituted data to produce manipulated data using a designated expansion algorithm.

1       37.     The method of claim 36, further comprising the step of storing the manipulated  
2     data in subprocessor memory.

1       38.     The method of claim 37, further comprising the step of retrieving the data stored  
2     in subprocessor memory.

1       39.     The method of claim 34, wherein the step of performing the ShiftRow operation  
2     further comprises:  
3           receiving a first portion of the data in response to a designated clock signal;

4 cascading the received first portion of the data through a first set of FIFOs in response to  
5 consecutive clock signals after the designated clock signal;  
6 releasing the cascaded first portion of the data;  
7 receiving a second portion of the data in response to the designated clock signal;  
8 cascading the received second portion of the data through a second set of FIFOs in  
9 response to the consecutive clock signals after the designated clock signal;  
10 buffering the cascaded second portion of the data for one clock cycle; and  
11 releasing the buffered second portion of the data.

1 40. The method of claim 39, further comprising the step of arranging the data into a  
2 matrix having four rows, each of the four rows having four bytes.

1 41. The method of claim 40, wherein the step of cascading the received first portion  
2 of the data through the first set of FIFOs comprises the steps of:

3 receiving a first set of three bytes of the data matrix in response to a designated clock  
4 signal;

5 releasing the first set of three bytes of the data matrix in response to a first clock signal,  
6 the first clock signal immediately following the designated clock signal;

7 receiving a second set of three bytes of the data matrix in response to the first clock  
8 signal;

9 releasing the second set of three bytes of the data matrix in response to a second clock  
10 signal, the second clock signal immediately following the first clock signal;

11 receiving a third set of three bytes in response to the second clock signal;

12           releasing the third set of three bytes in response to a third clock signal, the third clock  
13       signal immediately following the second clock signal;  
14           receiving a fourth set of three bytes in response to the third clock signal; and  
15           releasing the fourth set of three bytes in response to a fourth clock signal, the fourth clock  
16       signal immediately following the third clock signal.

1           42.       The method of claim 41, wherein the step of cascading the received first portion  
2       of the data through the first set of FIFOs further comprises the steps of:  
3           receiving two bytes of the first set of three bytes in response to the first clock signal;  
4           releasing the two bytes of the first set of three bytes in response to the second clock  
5       signal;  
6           receiving two bytes of the second set of three bytes in response to the second clock  
7       signal;  
8           releasing the two bytes of the second set of three bytes in response to the third clock  
9       signal;  
10          receiving two bytes of the third set of three bytes in response to the third clock signal;  
11          releasing the two bytes of the third set of three bytes in response to the fourth clock  
12       signal;  
13          receiving two bytes of the fourth set of three bytes in response to the fourth clock signal;  
14       and  
15          releasing the two bytes of the fourth set of three bytes in response to a fifth clock signal,  
16       the fifth clock signal immediately following the fourth clock signal.

1       43.     The method of claim 42, wherein the step of cascading the received first portion

2     of the data through the first set of FIFOs further comprises the steps of:

3           receiving one byte of the two bytes of the first set of three bytes in response to the second

4     clock signal;

5           releasing the one byte of the two bytes of the first set of three bytes in response to the

6     third clock signal;

7           receiving one byte of the two bytes of the second set of three bytes in response to the

8     third clock signal;

9           releasing the one byte of the two bytes of the second set of three bytes in response to the

10    fourth clock signal;

11          receiving one byte of the two bytes of the third set of three bytes in response to the fourth

12    clock signal;

13          releasing the one byte of the two bytes of the third set of three bytes in response to the

14    fifth clock signal;

15          receiving one byte of the two bytes of the fourth set of three bytes in response to the fifth

16    clock signal; and

17          releasing the one byte of the two bytes of the fourth set of three bytes in response to a

18    sixth clock signal, the sixth clock signal immediately following the fifth clock signal.

1       44.     The method of claim 43, wherein the step of cascading the received second

2     portion of the data through the second set of FIFOs comprises the steps of:

3           receiving a second set of one byte of the data matrix in response to the first clock signal;

4           releasing the second set of one byte of the data matrix in response to the second clock  
5       signal;  
6           receiving a third set of one byte of the data matrix in response to the second clock signal;  
7           releasing the third set of one byte of the data matrix in response to the third clock signal;  
8           receiving a fourth set of one byte in response to the third clock signal;  
9           holding the fourth set of one byte in response to the fourth clock signal; and  
10          releasing the fourth set of one byte in response to the fifth clock signal.

PCT/EP2007/000337

1           45.     The method of claim 44, wherein the step of cascading the received second  
2       portion of the data through the second set of FIFOs further comprises the steps of:  
3           receiving, in response to the second clock signal, the second set of one byte of the data  
4       matrix and one byte of the first set of three bytes;  
5           releasing, in response to the third clock signal, the second set of one byte of the data  
6       matrix and the one byte of the first set of three bytes;  
7           receiving, in response to the third clock signal, the third set of one byte of the data matrix  
8       and one byte of the second set of three bytes;  
9           holding, in response to the fourth clock signal, the third set of one byte of the data matrix  
10      and the one byte of the second set of three bytes;  
11           releasing, in response to the fifth clock signal, the third set of one byte of the data matrix  
12      and the one byte of the second set of three bytes;  
13           receiving, in response to the fifth clock signal, the fourth set of one byte of the data  
14      matrix and one byte of the third set of three bytes; and

15           releasing, in response to the sixth clock signal, the fourth set of one byte of the data  
16       matrix and the one byte of the third set of three bytes .

1           46.      The method of claim 45, wherein the step of cascading the received second

2       portion of the data through the second set of FIFOs further comprises the steps of:

3           receiving, in response to the third clock signal, the second set of one byte of the data

4       matrix and two bytes of the first set of three bytes;

5           holding, in response to the fourth clock signal, the second set of one byte of the data

6       matrix and two bytes of the first set of three bytes;

7           releasing, in response to the fifth clock signal, the second set of one byte of the data

8       matrix and two bytes of the first set of three bytes;

9           receiving, in response to the fifth clock signal, the third set of one byte of the data matrix

10       and two bytes of the second set of three bytes;

11           releasing, in response to the sixth clock signal, the third set of one byte of the data matrix

12       and two bytes of the second set of three bytes; and

13           output, in response to the sixth clock signal, the fourth set of one byte of the data matrix

14       and two bytes of the third set of three bytes .

1           47.      The method of claim 35, wherein the step of performing the byte substitution

2       operation further comprises the step of accessing a substitution table located in the host memory.

1       48. The method of claim 35, wherein the step of performing the byte substitution  
2 operation further comprises the step of accessing a substitution table configured as a hardware  
3 logic component within the subprocessor.

1       49. The method of claim 35, wherein the step of performing the byte substitution  
2 operation further comprises the step of accessing a substitution table located in the subprocessor  
3 memory.